Aperiodicity control in ARX-based speech analysis-synthesis method

Authors

  • Takahiro Ohtsuka
  • Hideki Kasuya
Abstract

We present an improved algorithm for a robust speech analysis-synthesis method based on a previously proposed auto-regressive with exogenous input (ARX) speech production model. The method automatically estimates vocal tract (formant) and voice source parameters from a speech utterance and yields accurate formant values even for very high-pitched voices. The improved algorithm incorporates the aperiodic components of the voice source signal, taking the dynamic nature of the speech production process into account. Perceptual experiments show that including the aperiodic components in the analysis-synthesis is very effective in improving the perceived quality of synthetic speech, particularly for soft voices typical of female voice quality.
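To make the ARX idea concrete, here is a minimal synthesis-side sketch. It is not the authors' algorithm: it assumes the common ARX form s[n] = -Σ a_i s[n-i] + b0·u[n] + e[n], uses a plain impulse train as a stand-in for a parametric voice source u, and uses white noise as a stand-in for the aperiodic component e. All function names, the sign convention, and the parameter values are illustrative assumptions.

```python
import random


def arx_synthesize(a, u, e, b0=1.0):
    """Run the assumed ARX recursion s[n] = -sum_i a[i]*s[n-1-i] + b0*u[n] + e[n].

    a  : auto-regressive (vocal tract) coefficients a_1..a_p (hypothetical values)
    u  : exogenous input, standing in for the periodic voice source
    e  : additive term, standing in for the aperiodic component
    """
    s = []
    for n in range(len(u)):
        acc = b0 * u[n] + e[n]
        for i, coeff in enumerate(a):
            k = n - 1 - i  # index of the past output sample s[n-1-i]
            if k >= 0:
                acc -= coeff * s[k]
        s.append(acc)
    return s


def pulse_train(length, period):
    """Impulse train with one pulse every `period` samples (toy periodic source)."""
    return [1.0 if n % period == 0 else 0.0 for n in range(length)]


def white_noise(length, scale, seed=0):
    """Seeded Gaussian noise (toy aperiodic component)."""
    rng = random.Random(seed)
    return [scale * rng.gauss(0.0, 1.0) for _ in range(length)]


# Toy example: a one-pole "vocal tract" driven by pulses plus a little noise.
u = pulse_train(200, 50)
e = white_noise(200, 0.05)
s = arx_synthesize([-0.9], u, e)
```

Raising or lowering the `scale` argument of `white_noise` is one crude way to picture "aperiodicity control": it changes how much non-periodic energy is mixed into the otherwise periodic excitation.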

Similar resources

Aperiodicity extraction and control using mixed mode excitation and group delay manipulation for a high quality speech analysis, modification and synthesis system STRAIGHT

A new control paradigm of source signals for high quality speech synthesis is introduced to handle a variety of speech quality, based on time-frequency analyses by the use of an instantaneous frequency and group delay. The proposed signal representation consists of a frequency domain aperiodicity measure and a time domain energy concentration measure to represent source attributes, which supplem...

An improved speech analysis-synthesis algorithm based on the autoregressive with exogenous input speech production model

Ding et al. have explored a novel pitch-synchronous speech analysis-synthesis method[1] based on an auto-regressive with exogenous input (ARX) speech production model. This method makes an automatic estimation of the vocal tract (formant) and voice source parameters from a speech utterance. This method, however, has suffered deficiencies in the analysis of a high-pitch voice and the introductio...

A Data-driven Approach to Source-forma

A data-driven formant-type TTS system is proposed. The formant-type speech synthesizer is one of the most promising architectures to enable flexible control of various voice qualities. By applying the ARX-based speech analysis method, source and formant parameters are automatically obtained. It is shown that a TTS system can be built by using the parameters, without requiring any heuristic rules...

Separation of Voiced Source Charac Transfer Function Characteristics Fo Analysis Based on Ar-h

A new method was developed for the separation of source and transfer function characteristics of speech sounds, with the aim of using it for “flexible” speech synthesis. The method is based on representing the source waveform by an HMM and the transfer function by an AR process (AR-HMM model). As compared to methods based on ARX model, where a parametric representation is assumed for source wavefor...

A Frequency Domain Approach to ARX-LF Voiced Speech Parameterization and Synthesis

The ARX-LF model interprets voiced speech as an LF derivative glottal pulse exciting an all-pole vocal tract filter with an additional exogenous residual signal. It fully parameterizes the voice and has been shown to be useful for voice modification. Because time domain methods to determine the ARX-LF parameters from speech are very sensitive to the time placement of the analysis frame and ...

Publication date: 2001